1  Introduction

This is the final report for the Phase I project “Realtime Control of Multiple
Sensor Systems” (DAAL03-92-C-0025). In Phase I we defined and developed
an analytical framework for the solution of a class of sensor scheduling
problems, and we developed prototype software tools for their numerical
analysis. We also developed special solution techniques for sensor scheduling
in linear Gaussian systems and for Gaussian signals observed through
nonlinear functions. These results establish the feasibility of sensor
scheduling methodologies for a class of sensor and signal processing models.


1.1  Sensor  Scheduling  and  Signal  Processing 

Sensor technology for military (and civilian) applications has undergone
rapid development during the past decade. Improvements and new developments
include focal plane electro-optical arrays, electronically scanned arrays,
multistatic operational modes, active spread-spectrum waveforms, and
automatic target recognition systems. Conventional systems like radar have
been improved, and new technologies have been placed into operation, such as
Infrared Search and Track (IRST), Electro-Optical (EO) sensors, and various
types of Electronic Support Measures (ESM) with Automatic Target Recognition
capability. Countermeasures and counter-countermeasures have also been
improved, and the operational environment has become increasingly demanding.

The increased use of multiple sensor systems on platforms has led to
changes in the way sensor implementations and operations are designed and
used. In traditional single sensor systems the decision processes (detection,
estimation, tracking, classification, etc.) could be located within the
processing unit of the sensor. In effect, the decision process is part of the
sensor system, and it works with the low dimensional signals generated by the
sensor signal processing.

In contrast, in a multi-sensor system, each sensor is only a contributor to a
composite decision process. This data fusion process lies outside the sensor’s
processing capability. It brings a new element to the design of sensor systems
in which the operational performance of each individual sensor is important
only as an element of the whole [17]. Sensor fusion is the aggregation of the
information from all the sensors. As currently used, sensor fusion does not
involve the active “control” of the sensor elements.

As used in our work, sensor scheduling and control involves the simultaneous
selection, based on quantitative performance measures, of a configuration
of sensors (from a collection of sensors) to collect data and of the
associated signal processing (detection/estimation) schemes for the
individual sensors in the active configuration. Sensor management in this
sense is a key concern in the design and operation of multiple sensor
platforms and distributed sensor networks in both military and commercial
applications. Platforms having multiple sensors (e.g., satellites or aircraft
with radar, infrared, video detectors, ESM, comm links, etc.) must manage the
sensor configuration and coordinate (“fuse”) the data obtained from the
sensors in use at any time. The data may vary in quality as a function of the
system and target state.

Scheduling and manipulating (controlling the parameters of) the sensor
suite as a function of time and state is necessary to obtain the best overall
performance (detection, estimation, and identification). Given a sensor
configuration, algorithms are required for allocating confidence or making
decisions, in real time, based on data collected from different types of
sensors. These are the two main issues addressed in this research.

For example, radar sensors are more accurate than infrared sensors for
range tracking, but infrared generally supplies better bearing data. Detecting
a target and classifying it using the two types of sensors involves decisions to
activate the radar, to control its parameters (sweep space, range gate, etc.),
and to accept or reject hypotheses in real time based on the two types of
information being provided by the sensors. This process involves not only
sensor data fusion in the conventional sense, but also dynamic feedback
control of the sensor system itself, including on-off activation of the active
(emissions) sensor and optimal control of its parameters while active. There
can also be continuous control in the signal processing algorithms (e.g.,
automatic gain control). Constraints on emissions to limit self-exposure must
be taken into account in using the active sensor(s). Limiting the search space
can improve the detection of targets in that space, but hinder the detection
of targets elsewhere.

In sensor networks (e.g., satellite surveillance networks or underwater
sonar arrays) one needs to coordinate data collected from many sensors
distributed over a large geographical area, and in some cases manage the
communications among the sensors. Conflicts must be resolved and a preferred
set of sensors selected over finite (short) time intervals, and utilized in
detection, estimation, or control decisions. Similarly, large scale industrial
(e.g., chemical) processes include an information network for collecting and
processing data, and making the results available to operators and automatic
controllers for actions. The need to collect and coordinate this information
in a systematic way is critical to effective and efficient operation of the plant.

In these problems the dynamic management and scheduling of sensors
may, in principle, be based on optimization of certain performance measures.
These performance measures should include terms allocating penalties for
errors in detection and/or estimation. They must also include terms for costs
associated with operating the sensors (e.g., turning sensors on or off) and
for switching from one sensor configuration to another. For example, turning
on an active radar sensor increases the detectability of the platform, and
this should be reflected as a switching cost. In certain applications, using a
more accurate, more complex sensor may require higher bandwidth
communications and/or more computational resources allocated to that sensor.
If quantified, these costs can be included in the control of the sensor systems.

In distributed networks, sensor scheduling may mean the physical movement
of a platform (such as a helicopter or airplane) to a particular geographical
location. In large scale systems, the use of many (often hundreds of) sensors
for decision making may provide better average performance, but may slow
the response of the system to changing conditions and increase computational
and communication costs, in terms of both hardware and software. The latter
are evident in large computer/communication networks. The values of the
operating and switching costs can depend on the values of the state vector.
For example, certain types of sensors have accuracy or noise characteristics
that vary as a function of the values being measured. There are costs
associated with the transfer of information, or tracking records, when there
are changes in the set of sensors used. These costs can also depend on the
state process.


1.2  Sensor  Scheduling  and  Sensor  Fusion 

Sensor  fusion  generally  refers  to  the  aggregation  of  information  from  several 
sensors  to  produce  statistical  determinations  (detection,  estimation,  etc.) 
superior  to  the  determinations  achieved  from  any  single  operational  sensor. 
Sensor  fusion,  interpreted  in  this  way,  is  a  component  of  the  sensor  scheduling 
formalism. 

In its most classical form, sensor fusion involves an aggregate hypothesis
testing procedure (see, e.g., [18, 38, 40]). Several sensors receive and
process data from some phenomenon, each performing statistical evaluations
of the data. For example, each sensor might implement a Neyman-Pearson
(NP) test on its data stream to decide between two hypotheses.
The decisions are transmitted to a “fusion center” where further processing
is done to arrive at the “best decision.” For example, the fusion center could
implement an NP test on the sensor decisions. Using this formulation, it is
possible to show, for example, that a configuration of $N$ similar sensors, each
characterized by the same (probability of false alarm, probability of
detection) pair $(P_F, P_D)$, can achieve an aggregate $(P_F, P_D)$ superior
to that of the best individual sensor if $N > 3$ [40]. If the sensors transmit
“quality of estimate” (confidence level) information together with the
decisions to the fusion center, then performance can be further improved
[17, 40].
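
The gain from fusing several identical sensors can be illustrated with a small calculation. The sketch below is not the NP fusion test of [40]; it assumes a simpler k-out-of-N voting rule at the fusion center, statistically independent sensor decisions, and illustrative values $P_F = 0.1$, $P_D = 0.8$. The function names are hypothetical.

```python
from math import comb
from typing import Tuple


def k_out_of_n(p: float, n: int, k: int) -> float:
    """Probability that at least k of n independent sensors declare a detection,
    when each one does so with probability p."""
    return sum(comb(n, m) * p**m * (1 - p) ** (n - m) for m in range(k, n + 1))


def fused_operating_point(p_f: float, p_d: float, n: int, k: int) -> Tuple[float, float]:
    """Aggregate (P_F, P_D) of a k-out-of-n vote over identical sensors."""
    return k_out_of_n(p_f, n, k), k_out_of_n(p_d, n, k)


if __name__ == "__main__":
    # Illustrative numbers only: five sensors, majority (3-out-of-5) vote.
    pf, pd = fused_operating_point(p_f=0.1, p_d=0.8, n=5, k=3)
    print(f"single sensor: P_F=0.10000, P_D=0.80000")
    print(f"3-out-of-5:    P_F={pf:.5f}, P_D={pd:.5f}")
```

Under these assumptions the vote lowers the false alarm probability and raises the detection probability at the same time, which no single sensor at the same operating point can do.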

Defined in this way, the sensor fusion process may be regarded as providing
information used in sensor management as a dynamical process. For
example, the decision to activate additional sensors could be based on the
failure of the aggregate NP test at the fusion center to provide a decision with
a sufficiently high confidence level using available information. Similarly,
sensors with controllable parameters (sweep rate, sweep volume) could be
commanded to change those parameters to conduct a more effective
surveillance. If sensor activation or control involves exposure through
emission of radiation or limits the detection of new targets by concentration
of activity, then these “costs” can be weighed against the detection and
estimation improvements gained by active control.

Sensor fusion algorithms are generally developed based on static
information (as described above), which may include all the information
gathered up to the present time. While the signal processing algorithms used
at the sensors might be recursive (e.g., Kalman filtering), the processing at
the fusion center is static; it generally does not update the likelihood (NP)
test recursively.*

In contrast, our formulation of sensor scheduling is essentially a dynamical
process leading to a feedback control strategy to manage the sensor
configuration. The diagram in Figure 1.1 illustrates the relationship between
these views of sensor scheduling and sensor fusion.

Figure 1.1: Sensor fusion, scheduling, and control.

* In our terminology a recursive process is one which updates its state at any
time based on the previous state and the new information available. It does
not use all past information, except as summarized in the “state variables.”
The Kalman filter is a recursive algorithm; the Wiener filter is not.
Recursive algorithms require fixed memory size (the state dimension);
non-recursive algorithms require “growing memory.”
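
The distinction between fixed and growing memory can be made concrete with the simplest possible estimator, a sample mean. This is only an illustration of the terminology, not an algorithm from this report; the class and function names are hypothetical.

```python
class RecursiveMean:
    """Recursive estimator: the state is (count, mean), so memory is fixed."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0

    def update(self, y: float) -> float:
        # New state depends only on the previous state and the new datum.
        self.count += 1
        self.mean += (y - self.mean) / self.count
        return self.mean


def batch_mean(history):
    """Non-recursive estimator: needs the whole (growing) observation history."""
    return sum(history) / len(history)


if __name__ == "__main__":
    data = [2.0, 4.0, 6.0, 8.0]
    estimator = RecursiveMean()
    history = []
    for y in data:
        history.append(y)
        print(estimator.update(y), batch_mean(history))
```

Both estimators return the same value at every step; they differ only in what must be stored between observations.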
1.3  Sensor  Scheduling  and  Man-in-the-Loop  Systems

In many operational environments a human decision maker interacts with the
sensor systems on a continuous basis; for example, the Radar Intercept Officer
in tactical aircraft. The human’s function includes subjective fusion of the
information presented by the sensors to make determinations about the state
of the operating environment. We shall briefly comment on an example of
the design of interactive multi-sensor management systems (the KOALAS
architecture) to illustrate the role played by a sensor scheduling subsystem
in the man-machine interaction. This is important for some applications of
multiple sensor systems, such as ground based control of surveillance
networks (of satellites), and in civilian systems such as criminal entry
surveillance systems.

In the work described in [6, 23, 24] the interactive architectures developed
by Barrett and his colleagues [5] for real time decision support systems have
been applied to the development of sensor utilization systems for tactical
aircraft. In this architecture, the human operator plays a key role in the
sensor management process. Based on processed data from the sensors (a subset
of the total), the operator can inject “hypotheses” into the system, altering
the sensor configuration to evaluate specific possibilities.

Thus, if the infrared search and track sensor (IRST) indicates a target on
a heading, the radar can be activated to provide a range measurement, or, if
already active, its sweep volume can be revised to focus along the bearing
indicated by IRST. In the implementations discussed in [6, 23], when a
hypothesis is entered by the operator, the sensor configuration is selected (by
a rule based expert system) to “best” evaluate this hypothesis. The expert
system contains knowledge about sensor use provided by subject matter
experts. The overall architecture for this interactive decision support process
is called KOALAS (Knowledgeable, Observation Analysis-Linked Advisory
System).

Our formulation of sensor scheduling could be embedded naturally into the
KOALAS architecture (or other man-in-the-loop sensor fusion systems with
similar function). The scheduling algorithm plays a role analogous to the
expert system in selecting the best sensor suite configuration to evaluate a
hypothesis. The selection is based on quantitative models and performance
criteria. Clearly, in situations where such models are not available, or in those
where subjective factors (threat intent) dominate the engagement,
quantitative algorithms may not be appropriate. However, if signal and target
models are available and useful, then sensor scheduling is useful. For
autonomous platforms it is perhaps the best choice.

In Figure 1.2 we show the KOALAS architecture modified to incorporate
a sensor scheduling block. In the diagram all elements inside the dashed line
are automatic. The operator interaction takes place through the interface.
The scheduler functions as an automatic controller to manage the sensor
system in response to inputs from the operator and feedback signals from
the signal processing subsystem. In the absence of operator inputs it would
control the sensor system “automatically” using the algorithms described in
the next section.

An operator input could serve as a manual control, or as a constraint on
the scheduling controller. Alternatively, the operator input could determine
the specific cost structure used to determine the optimal sensor configuration;
e.g., weigh radar performance 10 times more heavily than IRST performance
as components of the overall sensor system performance index. Given the
performance index (weightings), the scheduler would compute and implement
the optimal configuration.

This same philosophy could be used for autonomous vehicles in certain
operational environments. For example, a priori knowledge about anticipated
operational scenarios could be stored in the system. When a particular
scenario is detected (e.g., atmospheric versus extra-atmospheric or deep versus
shallow water operations), the sensor performance index could be modified
to reflect the scenario. The scheduling algorithm then solves for the best
configuration, given the performance index and, indirectly, the operational
scenario.

Figure 1.2: Sensor scheduling in a man-in-the-loop system.

In  summary,  the  theme  in  our  formulation  is  management  of  a  system 
of  sensors  providing  data  (for  signal  processing),  including  information  of 
widely  varying  quality  about  parameters  or  variables  of  interest,  for  control, 
detection,  estimation,  and  information  fusion  etc.  The  goal  of  the  research 
is  to  develop  systematic  conceptual,  analytical,  and  numerical  methods  for 
their  treatment. 

Our specific objective is to quantify the scheduling/management procedures
and develop algorithms for sensor scheduling and control. In Phase I,
we demonstrated how this may be done for certain types of sensors and signal
processing algorithms. We showed that the optimal sensor scheduling problem
can be described as an impulse control problem with partial observations
in a finite dimensional state space. We then showed that the partially
observed (stochastic control) problem can be transformed into a linear control
problem with perfect observations evolving on an infinite dimensional state
space. We showed that the transformed optimal impulse control problem can
be solved by a system of quasi-variational inequalities (QVIs), the analog of
dynamic programming for control problems with discrete valued controls and
switching costs.

In Phase I we implemented prototype numerical algorithms to solve these
QVIs. Solution of the QVIs leads not only to the optimal cost functions,
but also to the optimal switching schedule expressed in terms of switching
and continuation sets in the state space. We also developed a simplified
scheduling strategy for (linear) sensor systems operating on Gaussian signals.

2  Summary  of  Phase  I  Work

2.1  A  Model  for  Sensor  Scheduling  for  Diffusion  Processes 

The basic model used in our formulation of optimal sensor scheduling is
as follows. A signal (or state) process $x(\cdot)$ is given, modeled by the
vector valued, nonlinear diffusion process

$$dx(t) = f(x(t))\,dt + g(x(t))\,dw(t), \qquad x(0) = \xi, \quad t \in [0,T] \tag{2.1}$$

in $\mathbb{R}^n$. We consider $M$ noisy observations of $x(\cdot)$, described by

$$dy^i(t) = h^i(x(t))\,dt + R_i^{1/2}\,dv^i(t), \qquad y^i(0) = 0, \quad i = 1, \dots, M \tag{2.2}$$

with values in $\mathbb{R}^{d_i}$. Here $w(\cdot)$, $v^i(\cdot)$ are independent
Wiener processes and $R_i = R_i^T > 0$ are $d_i \times d_i$ matrices.

For linear dynamics, these equations take the form

$$dx(t) = A(t)x(t)\,dt + B(t)\,dw(t), \qquad x(0) = \xi, \quad t \in [0,T] \tag{1'}$$

and

$$dy^i(t) = H^i(t)x(t)\,dt + R_i^{1/2}\,dv^i(t), \qquad y^i(0) = 0. \tag{2'}$$

Thus, due to the linearity of the problem, the state $x(t)$ and observations
$y(t)$ will be Gaussian (given a Gaussian initial condition $\xi$).
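
For intuition, the linear signal and observation model can be simulated numerically. The following is a minimal sketch, assuming a scalar state, constant coefficients, scalar sensors, and an Euler-Maruyama time discretization; none of these choices come from the report, and the function name and parameter names are hypothetical.

```python
import math
import random


def simulate_linear(a, b, h, r, x0, T, steps, seed=0):
    """Euler-Maruyama simulation of
        dx   = a x dt + b dw,
        dy^i = h[i] x dt + sqrt(r[i]) dv^i,   y^i(0) = 0,
    for a scalar state and M = len(h) scalar sensors with noise variances r[i]."""
    rng = random.Random(seed)
    dt = T / steps
    sqrt_dt = math.sqrt(dt)
    x = x0
    y = [0.0] * len(h)
    xs, ys = [x], [list(y)]
    for _ in range(steps):
        # Observation increments use the state at the start of the step.
        for i in range(len(h)):
            dv = sqrt_dt * rng.gauss(0.0, 1.0)
            y[i] += h[i] * x * dt + math.sqrt(r[i]) * dv
        dw = sqrt_dt * rng.gauss(0.0, 1.0)
        x += a * x * dt + b * dw
        xs.append(x)
        ys.append(list(y))
    return xs, ys


if __name__ == "__main__":
    xs, ys = simulate_linear(a=-1.0, b=0.5, h=[1.0, 2.0], r=[0.1, 0.4],
                             x0=1.0, T=1.0, steps=100)
    print("x(T) =", xs[-1], " y(T) =", ys[-1])
```

Each sensor has its own independent noise source, mirroring the independent Wiener processes $v^i(\cdot)$ in the model.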

The sensor scheduling problem is to determine an optimal utilization
schedule for the available sensors, so as to simultaneously minimize the cost
of errors in estimating a function of $x(\cdot)$ and the costs of using and
switching between various sensors. We need to specify these costs. Let
$c_i(x)$ denote the cost per unit time when using sensor $i$ while the state of
the system is $x$; let $k_{i0}(x)$, $k_{0i}(x)$ denote the cost for turning off,
respectively on, the $i$th sensor when the state of the system is $x$. The
signal processing objective is to compute, at time $T$, an estimate
$\hat\phi(T)$ of a given function $\phi(x(T))$ of the state.^ The estimation
error (cost) is defined by

$$E\{c_e(\phi(x(T)) - \hat\phi(T))\} := E\{\|\phi(x(T)) - \hat\phi(T)\|^2\} \tag{2.3}$$

We use $\mathcal{N}$ to denote the set of all possible sensor activation
configurations. An element $\nu \in \mathcal{N}$ is a word of length $M$ from
the alphabet $\{0, 1\}$. If the $i$th position in $\nu$ is occupied by a 1, the
$i$th sensor is activated (used); if by a 0, the $i$th sensor is off. There are
$N = 2^M$ elements in $\mathcal{N}$. A sensor schedule is a piecewise constant
function $u(\cdot) : [0, T] \to \mathcal{N}$.

We let $\tau_j \in [0,T]$ denote the switching times in the schedule; i.e., the
times when at least one sensor is turned on or off. Suppose the configuration
before a switch is $\nu \in \mathcal{N}$, and $\nu' \in \mathcal{N}$ is the
configuration after the switch. Then the associated switching cost is

$$k_{\nu\nu'}(x) := \sum_{\{i \in \nu,\ i \notin \nu'\}} k_{i0}(x) + \sum_{\{j \notin \nu,\ j \in \nu'\}} k_{0j}(x). \tag{2.4}$$

The total operating cost associated with use of the sensor configuration
$\nu \in \mathcal{N}$ is

$$c_\nu(x) := \sum_{\{i \in \nu\}} c_i(x). \tag{2.5}$$

In (2.4), (2.5), the symbol $\{i \in \nu\}$ denotes the set of all indices (from
the set $\{1, 2, \dots, M\}$) which are occupied by a 1 in $\nu$ (i.e., the
indices corresponding to the sensors which are on); similarly the symbol
$\{i \notin \nu\}$ denotes the set of indices of sensors that are off.
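
The configuration set and the two cost definitions (2.4), (2.5) translate directly into code. The sketch below assumes, for simplicity, costs that do not depend on the state $x$, and represents a configuration as a string of 0s and 1s; the function names are hypothetical.

```python
from itertools import product


def configurations(M):
    """All words of length M over {0, 1}, listed in binary order."""
    return ["".join(bits) for bits in product("01", repeat=M)]


def operating_cost(nu, c):
    """Eq. (2.5): sum of c[i] over the sensors that are on in nu."""
    return sum(ci for bit, ci in zip(nu, c) if bit == "1")


def switching_cost(nu, nu_new, k_off, k_on):
    """Eq. (2.4): k_off[i] for every sensor turned off, k_on[j] for every
    sensor turned on, when switching from nu to nu_new."""
    cost = 0.0
    for i, (old, new) in enumerate(zip(nu, nu_new)):
        if old == "1" and new == "0":
            cost += k_off[i]
        elif old == "0" and new == "1":
            cost += k_on[i]
    return cost


if __name__ == "__main__":
    print(configurations(2))
    print(operating_cost("11", c=[1.0, 2.5]))
    print(switching_cost("10", "01", k_off=[0.2, 0.3], k_on=[1.0, 4.0]))
```

Sensors whose status does not change contribute nothing to the switching cost, so switching from a configuration to itself costs zero.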

^  State  estimation  is  discussed  here  as  a  generic  application  of  sensor  signal  processing. 
Detection  and  hypothesis  testing  could  be  treated  by  methods  similar  to  those  discussed. 
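The observations under sensor schedule $u(\cdot)$ are

$$dy(t, u(t)) := h(x(t), u(t))\,dt + r(u(t))\,dv(t), \tag{2.6}$$

where it is apparent that the available observations depend explicitly on the
sensor schedule $u(\cdot)$. In (2.6), for $x \in \mathbb{R}^n$, $\nu \in \mathcal{N}$,

$$h(x,\nu) := \begin{bmatrix} h^1(x)\,\chi_{\{\nu\}}(1) \\ \vdots \\ h^M(x)\,\chi_{\{\nu\}}(M) \end{bmatrix} \tag{2.7}$$

is a block column vector, where in standard notation

$$\chi_{\{\nu\}}(i) = \begin{cases} 1, & \text{if the $i$th position in the word $\nu$ is occupied by a 1} \\ 0, & \text{otherwise.} \end{cases} \tag{2.8}$$

Similarly, for $\nu \in \mathcal{N}$,

$$r(\nu) := \text{Block diagonal}\{R_i^{1/2}\,\chi_{\{\nu\}}(i)\}, \tag{2.9}$$

where $R_i$ are the symmetric, positive definite matrices defined above. In (2.6),

$$v(t) := \begin{bmatrix} v^1(t) \\ \vdots \\ v^M(t) \end{bmatrix} \tag{2.10}$$

is a Wiener process. Based on (2.7) and (2.9), for all $\nu \in \mathcal{N}$,

$$h(\cdot,\nu) : \mathbb{R}^n \to \mathbb{R}^D, \qquad r(\nu) : \mathbb{R}^D \to \mathbb{R}^D, \qquad D = d_1 + d_2 + \cdots + d_M. \tag{2.11}$$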
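As an example, consider the case $M = 2$, $N = 4$. Then
$\mathcal{N} = \{00, 01, 10, 11\}$ and

$$h(x,00) = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \quad h(x,01) = \begin{bmatrix} 0 \\ h^2(x) \end{bmatrix}, \quad h(x,10) = \begin{bmatrix} h^1(x) \\ 0 \end{bmatrix}, \quad h(x,11) = \begin{bmatrix} h^1(x) \\ h^2(x) \end{bmatrix} \tag{2.12}$$

while

$$r(00) = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}, \quad r(01) = \begin{bmatrix} 0 & 0 \\ 0 & R_2^{1/2} \end{bmatrix}, \quad r(10) = \begin{bmatrix} R_1^{1/2} & 0 \\ 0 & 0 \end{bmatrix}, \quad r(11) = \begin{bmatrix} R_1^{1/2} & 0 \\ 0 & R_2^{1/2} \end{bmatrix}. \tag{2.13}$$

Clearly the dimension of the range space of $h(\cdot,\nu)$ is
$\sum_{i=1}^{M} d_i\,\chi_{\{\nu\}}(i)$. Also, for all $\nu$, $y(t,\nu) \in \mathbb{R}^D$.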
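
The construction of the stacked observation function and the block diagonal noise matrix for a given configuration can be sketched as follows. The code assumes scalar sensors ($d_i = 1$, so each $R_i$ is a positive number) and represents a configuration as a string of 0s and 1s; these simplifications and all function names are illustrative, not part of the report.

```python
import math


def chi(nu, i):
    """Indicator of eq. (2.8): 1 if sensor i (1-based) is on in the word nu."""
    return 1 if nu[i - 1] == "1" else 0


def h_stacked(x, nu, h_funcs):
    """Eq. (2.7) for scalar sensors: entries of inactive sensors are zeroed."""
    return [h(x) * chi(nu, i) for i, h in enumerate(h_funcs, start=1)]


def r_blockdiag(nu, R):
    """Eq. (2.9) for scalar sensors: diagonal matrix with sqrt(R_i) * chi(i)."""
    M = len(R)
    return [
        [math.sqrt(R[i]) * chi(nu, i + 1) if i == j else 0.0 for j in range(M)]
        for i in range(M)
    ]


if __name__ == "__main__":
    h_funcs = [lambda x: x, lambda x: x ** 2]  # hypothetical h^1, h^2
    for nu in ["00", "01", "10", "11"]:
        print(nu, h_stacked(3.0, nu, h_funcs), r_blockdiag(nu, [4.0, 9.0]))
```

The output vector always has the full dimension $D$; switching a sensor off zeroes its block rather than removing it, which is what makes $r(\nu)$ singular for most configurations.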

Following established terminology for switching (impulse) control problems
(cf. [10]), we see that a sensor scheduling strategy is defined by an
increasing sequence of switching times $\tau_j \in [0, T]$ and the corresponding
sequence $\nu_j \in \mathcal{N}$ of sensor activation configurations. We shall
denote such a strategy by $u(\cdot)$, where

$$u(t) = \nu_j, \qquad t \in [\tau_j, \tau_{j+1}); \quad j = 1, 2, \dots \tag{2.14}$$
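
A schedule of this form is a right-continuous step function, which can be represented by its switching times and configurations. The following is a minimal sketch with deterministic switching times; the `initial` argument (the configuration in force before the first switching time) and the function name are assumptions made for the example.

```python
from bisect import bisect_right


def make_schedule(switch_times, configs, initial):
    """Return u(t), equal to configs[j] on [switch_times[j], switch_times[j+1])
    and to `initial` before the first switching time."""
    times = list(switch_times)
    if any(t1 >= t2 for t1, t2 in zip(times, times[1:])):
        raise ValueError("switching times must be strictly increasing")
    if len(times) != len(configs):
        raise ValueError("need exactly one configuration per switching time")

    def u(t):
        j = bisect_right(times, t)  # number of switching times <= t
        return initial if j == 0 else configs[j - 1]

    return u


if __name__ == "__main__":
    u = make_schedule([0.5, 2.0], ["11", "01"], initial="10")
    for t in [0.0, 0.5, 1.9, 2.0, 3.0]:
        print(t, u(t))
```

Using `bisect_right` makes each interval closed on the left and open on the right, matching the half-open intervals in (2.14).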

The sensor scheduling problem involves the simultaneous minimization of
costs due to estimation errors and sensor system operation. The estimation
and scheduling processes are interrelated, and we must therefore consider
joint estimation and sensor scheduling strategies. Such a strategy consists of
two inseparable parts: the sensor scheduling strategy $u$ (see (2.14)) and the
state estimator $\hat\phi$. The set of admissible strategies $U_{ad}$ is the
customary set of strategies adapted to the sequence of observations, defined
in terms of the $\sigma$-algebras

$$\mathcal{F}_t^{y(\cdot,u(\cdot))} := \sigma\{y(s,u(s)),\ s \le t\}. \tag{2.15}$$

That is, we consider strict sense admissible controls in the sense of [19].
This statement must be interpreted carefully. First, as indicated in (2.15), the
observation data $\mathcal{F}_t^{y(\cdot,u(\cdot))}$ depend strongly on the
sensor schedule $u(\cdot)$ (as is evident from (2.7) - (2.10)). This dependence
is non-standard, since the dimensions of the observation vector and the noise
covariance change drastically at each switching time $\tau_j$. In standard
stochastic control formulations [19, 12], the dependence of observations $y$
on controls $u(\cdot)$ is generally implicit. This is a key difficulty of the
sensor scheduling problem which limits the use of standard techniques (e.g.,
Girsanov transformations). Second, (2.15) means that the switching times
$\tau_j$ and the sensor configurations $\nu_j$, which define a schedule
$u(\cdot)$, must be adapted to the observation data
$\mathcal{F}_t^{y(\cdot,u(\cdot))}$, which itself depends essentially on the
values of $\tau_j$ and $\nu_j$. This is analogous to “free-boundary” problems
in mathematical physics, where the location of the boundary conditions
depends on the solution. As we shall see later, the (QVI) conditions which
define the optimal switching cost also involve free boundaries.

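Given a sensor schedule $u(\cdot)$ and an estimator $\hat\phi$, the
corresponding cost is

$$J(u(\cdot),\hat\phi) := E\Big\{\|\phi(x(T)) - \hat\phi(T)\|^2 + \int_0^T c(x(t),u(t))\,dt + \sum_j k(x(\tau_j),\nu_{j-1},\nu_j)\Big\}. \tag{2.16}$$

The three terms correspond to the estimation error, the operational cost of
the sensors, and the cost of switching among sensor configurations.

Here, for $x \in \mathbb{R}^n$, $c(x,\nu) := c_\nu(x)$ and $k(x,\nu,\nu') := k_{\nu\nu'}(x)$.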
In summary, the optimal sensor scheduling problem, including nonlinear
filtering, is therefore the determination of a strategy achieving

$$\inf_{u(\cdot),\,\hat\phi} J(u(\cdot),\hat\phi) \tag{2.17}$$

among all admissible strategies.

This problem involves the simultaneous estimation and optimal
control/scheduling of a system with partial observations. Using recent research
in the control theory of stochastic systems with partial observations, it is
possible to convert the problem (2.17) into an optimal control problem with
perfect observations of the conditional density of the signals. The conditional
density functions evolve on an infinite dimensional space, which increases the
difficulty in solving the system; however, they satisfy a linear (Zakai)
equation. The conversion is described below.

To simplify the presentation, we order the elements of $\mathcal{N}$ according
to the numbers they represent in binary form. For example, in the case
$M = 2$, $N = 4$ we replace $\mathcal{N} = \{00, 01, 10, 11\}$ by the set of
integers $\{1, 2, 3, 4\}$. That is, the one-to-one correspondence between
$\mathcal{N}$ and $\{1, 2, \dots, N\}$ is described by

$$\nu \mapsto (\text{integer represented by } \nu) + 1, \qquad k \mapsto \text{binary representation of } (k - 1). \tag{2.18}$$

This permits us to replace all the $\nu$, $\nu'$ by the corresponding integers
from $\{1, 2, \dots, N\}$.

2.1.1  Transformation  of  the  Control  Problem 

The transformation is given in two steps: (i) reformulate the scheduling
problem as a “standard” impulse control problem; (ii) use a probability
(Girsanov) transformation to rewrite the cost function in terms of the
conditional density, which becomes the state for the control/scheduling problem.


Reformulation as an Impulse Control Problem: Consider an impulsive
control defined as follows. There is a sequence
$\tau_1 < \tau_2 < \dots < \tau_k < \dots$ of increasing stopping times. To each
time $\tau_i$ we associate an $\mathcal{F}_{\tau_i}^{y(\cdot,u(\cdot))}$-measurable
random variable $u_i$ (measurable with respect to the observation
$\sigma$-algebra of (2.15) at time $\tau_i$) with values in the set of integers
$\{1, 2, \dots, N\}$.^ We define

$$u(t) = u_i, \qquad \tau_i \le t < \tau_{i+1}, \quad i = 0, 1, 2, \dots \tag{2.19}$$

and set $\tau_0 = 0$. We require that $\tau_i \uparrow T$ as $i \uparrow \infty$,
with $\tau_k = T$ possible for some finite $k$.

Let $\nu_i$ be the element of $\mathcal{N}$ corresponding to $u_i$ via (2.18).
Then define

$$h(x,u(t)) := h(x,\nu_i), \qquad \tau_i \le t < \tau_{i+1}, \tag{2.20}$$

where $h(x,\nu)$ is defined by (2.7), in terms of the given functions
$h^i(\cdot)$. Clearly $h(\cdot,u(t))$ maps $\mathbb{R}^n$ into $\mathbb{R}^D$ for
all sensor schedules $u(\cdot)$, and it is bounded and Hölder continuous in $x$
whenever the $h^i(\cdot)$ are. Define also

$$r(u(t)) := r(\nu_i), \qquad \tau_i \le t < \tau_{i+1}, \tag{2.21}$$

where $r(\cdot)$ is defined by (2.9), in terms of the given matrices $R_i$,
$i = 1, 2, \dots, M$. Clearly $r(u(t))$ maps $\mathbb{R}^D$ into $\mathbb{R}^D$
for all sensor schedules $u(\cdot)$, but it is singular unless all sensors are
active. Define $\tilde h(x,\nu)$ to be the vector valued function

$$\tilde h(x,\nu) := \begin{bmatrix} R_1^{-1/2}\,h^1(x)\,\chi_{\{\nu\}}(1) \\ \vdots \\ R_M^{-1/2}\,h^M(x)\,\chi_{\{\nu\}}(M) \end{bmatrix} \tag{2.22}$$

with $\chi_{\{\nu\}}(i)$ defined as in (2.8). Let

$$\tilde h(x,u(t)) := \tilde h(x,\nu_i), \qquad \tau_i \le t < \tau_{i+1}. \tag{2.23}$$

Clearly $\tilde h(\cdot,u(t))$ maps $\mathbb{R}^n$ into $\mathbb{R}^D$ for all
sensor schedules $u(\cdot)$. We refer to $u(\cdot)$ as the impulsive control. It
defines the decision to select one of the functions $\tilde h(\cdot,k)$,
$k \in \{1, 2, \dots, N\}$, at a sequence of decision times. This provides a
mathematical formulation of the sensor selection decision process.

^ Recall that $N = 2^M$ and the binary representation of each integer
$1, 2, \dots, N$ determines a sensor activation configuration by (2.18).

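To simplify the notation we take $\phi(x) = x$. For this choice the optimal
estimator $\hat\phi(T)$ is the conditional mean

$$\hat\phi(T) = E_{u(\cdot)}\{x(T) \mid \mathcal{F}_T^{y(\cdot,u(\cdot))}\} \tag{2.24}$$

where $E_{u(\cdot)}$ denotes expectation with respect to $P_{u(\cdot)}$, the
probability defined by the sensor schedule, and
$\mathcal{F}_T^{y(\cdot,u(\cdot))}$ is the observation $\sigma$-algebra of (2.15).
Let $\mu(u,t)$ denote the conditional probability measure of $x(t)$, given
$\mathcal{F}_t^{y(\cdot,u(\cdot))}$, on $\mathbb{R}^n$. It is convenient to
express (2.24) as a vector valued functional of $\mu(u,t)$:

$$\hat\phi(T) = \Phi(\mu(u,T)) = \int_{\mathbb{R}^n} x\,d\mu(u,T). \tag{2.25}$$

This transformation permits us to rewrite the cost as a function of the
impulsive control $u(\cdot)$ only (i.e., the selection of the estimator
$\hat\phi(\cdot)$ has been eliminated):

$$J(u(\cdot)) = E_{u(\cdot)}\Big\{\|x(T) - \Phi(\mu(u,T))\|^2 + \int_0^T c(x(t),u(t))\,dt + \sum_{j=1}^{\infty} k(x(\tau_j),u(\tau_{j-1}),u(\tau_j))\,\chi_{\tau_j<T}\Big\}, \tag{2.26}$$

where $\chi_{\tau_j<T}$ is the characteristic function of the $\Omega$-set
$\{\omega;\ \tau_j(\omega) < T\}$.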

To prevent zero cost switching cycles, we assume that the switching costs
are uniformly bounded below:

$$k(x,i,j) \ge k_0, \qquad x \in \mathbb{R}^n, \quad i, j \in \{1, \dots, N\} \tag{2.27}$$

with $k_0$ a positive constant. As a consequence of this assumption, for $T$
finite the optimal policy will exhibit a finite number of sensor switchings.

The optimal sensor selection problem can now be stated precisely as the optimization problem $\mathcal{P}$: Find an admissible impulsive control $u^*(\cdot)$ such that

\[
J(u^*(\cdot)) = \inf_{u(\cdot) \in U_{ad}} J(u(\cdot)) \tag{2.28}
\]

where $U_{ad}$ is the set of all impulsive control strategies adapted to the observation $\sigma$-algebras. Problem $\mathcal{P}$ is a non-standard (because the dimension of the state changes) stochastic control problem of a partially observed diffusion.


The equivalent fully observed problem: Next we transform the impulse control problem to a fully observed stochastic control problem by introducing an evolution equation for the conditional density. Based on the theory of nonlinear filtering, consider the operator

\[
\rho(u(\cdot),t)(\psi) = E\{ \zeta(t)\, \psi(x(t)) \mid \mathcal{F}_t^{y(\cdot,u(\cdot))} \} \tag{2.29}
\]

for each impulsive control $u(\cdot)$, where the functional $\zeta$ is defined by

\[
\zeta(t) = \exp\Big\{ \int_0^t h(x(s),u(s))^T dz(s) - \frac{1}{2} \int_0^t \|h(x(s),u(s))\|^2 ds \Big\},
\]

where $^T$ denotes transpose and $\|\cdot\|$ is the Euclidean norm.

The notation is chosen so as to emphasize the dependence on $u(\cdot)$. The operator $\rho(u(\cdot),t)$ is the unnormalized conditional probability measure of $x(t)$ given the observations up to time $t$ [26]. As we shall see below, it can be written in terms of a conditional density which satisfies a linear stochastic PDE.

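A straightforward calculation [1] implies that

\[
E^{u(\cdot)}\{ \|x(T) - \hat{\phi}(T)\|^2 \} = E\{ \Psi(\rho(u(\cdot),T)) \}
\]

where $\Psi$ is a functional on finite measures of $\mathbb{R}^n$ defined by

\[
\Psi(\mu) = \mu(\chi^2) - \frac{\|\mu(\chi)\|^2}{\mu(1)} \tag{2.30}
\]

where $\chi^2(x) = \|x\|^2$, $\chi(x) = x$, and $\mu$ is any finite measure on $\mathbb{R}^n$ such that the quantities $\mu(\chi^2)$ and $\mu(\chi)$ make sense.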
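As a quick numerical illustration of the functional $\Psi$, the sketch below evaluates it for a discrete measure with finitely many weighted atoms and compares the result with the total mass times the variance of the normalized measure. The discrete setting and the names `psi` and `normalized_variance` are assumptions made for illustration only; they are not part of the Phase I software.

```python
def psi(weights, points):
    """Psi(mu) = mu(chi^2) - ||mu(chi)||^2 / mu(1) for a discrete measure.

    weights[m] is the (unnormalized) mass of the atom at points[m].
    """
    mass = sum(weights)                                   # mu(1)
    dim = len(points[0])
    first = [sum(w * x[d] for w, x in zip(weights, points))
             for d in range(dim)]                         # mu(chi)
    second = sum(w * sum(c * c for c in x)
                 for w, x in zip(weights, points))        # mu(chi^2)
    return second - sum(m * m for m in first) / mass


def normalized_variance(weights, points):
    """Trace of the covariance of the measure after normalizing to mass 1."""
    mass = sum(weights)
    dim = len(points[0])
    mean = [sum(w * x[d] for w, x in zip(weights, points)) / mass
            for d in range(dim)]
    return sum((w / mass) * sum((x[d] - mean[d]) ** 2 for d in range(dim))
               for w, x in zip(weights, points))


weights = [0.5, 1.5, 2.0]
points = [[0.0, 1.0], [2.0, -1.0], [1.0, 3.0]]
print(psi(weights, points), sum(weights) * normalized_variance(weights, points))
```

The two printed numbers agree up to rounding, which is the sense in which $\Psi$ applied to the unnormalized measure carries the estimation error.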

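Using these definitions, we can rewrite the cost corresponding to schedule $u(\cdot)$ as

\[
J(u(\cdot)) = E\Big\{ \Psi(\rho(u(\cdot),T)) + \int_0^T \langle \rho(u(\cdot),t), C(u(t)) \rangle\, dt + \sum_{i=1}^{\infty} \langle \rho(u(\cdot),\tau_i), K(u(\tau_{i-1}),u(\tau_i)) \rangle\, \chi_{\tau_i < T} \Big\} \tag{2.31}
\]

where $\langle \rho(u(\cdot),t), \phi \rangle$ denotes the inner product in $L^2(\mathbb{R}^n)$, and $\rho(u(\cdot),\cdot)$ is the unnormalized conditional density, which is the “information” state of the equivalent fully observed stochastic control problem.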

The evolution equation for $\rho(u(\cdot),\cdot)$ is the Zakai equation from nonlinear filtering theory.

Let 


i^idxi  'dxj  ^idxi 


(2.32) 


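Introducing the notation

\[
\tilde{h}(x,\nu) := \begin{bmatrix} R_1^{-1}\, h^1(x)\, \chi_{\{\nu\}}(1) \\ \vdots \\ R_M^{-1}\, h^M(x)\, \chi_{\{\nu\}}(M) \end{bmatrix} \tag{2.33}
\]

Then

\[
d\rho(u(\cdot),t) = L^* \rho(u(\cdot),t)\, dt + \rho(u(\cdot),t)\, \tilde{h}(\cdot,u(t))^T dy(t,u(\cdot)), \qquad \rho(u(\cdot),0) = p_0, \tag{2.34}
\]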

where $y(t,u(\cdot))$ is the observation process as a function of the sensor schedule.

Thus, the “state” of the optimal sensor scheduling problem satisfies a stochastic partial differential equation forced by the “controlled” observation process $dy(t,u(\cdot))$. The “controls” $u(\cdot)$ are the selections of the sensor configurations as a function of time. This formulation of the optimization problem involves complete observations of the (infinite dimensional) state $\rho(u(\cdot),t)$. Using it, we can define optimality conditions using an extension of the classical theory of impulse control problems [10].

2.1.2  Linear  Gaussian  Systems 

For the linear case, as modeled in Phase I, the Zakai equation reduces to the standard Kalman filter. Thus if the state $x(t)$ and observations $y(t)$ are described by equations (1') and (2'), and if we take $\phi(x) = x$ in the cost function, then the estimate $\hat{x}(t)$ (the conditional mean of $x(t)$ given the observations) and the error covariance (matrix) $\sigma(t)$ are given by

\[
d\hat{x}(t) = (A - \sigma H' R^{-1} H)\, \hat{x}\, dt + \sigma H' R^{-1} dy(t), \qquad \hat{x}(0) = E[x], \tag{2.35}
\]

\[
\dot{\sigma}(t) = (BB' - \sigma H' R^{-1} H \sigma) + A\sigma + \sigma A', \qquad \sigma(0) = \mathrm{cov}(x). \tag{2.36}
\]

Hence, $p(u(\cdot),t)$ is Gaussian and is given by

\[
p(z,u(\cdot),t) = (2\pi)^{-n/2} \big(\det \sigma(u(\cdot),t)\big)^{-1/2} \exp\Big\{ -\tfrac{1}{2} \big(z - \hat{x}(u(\cdot),t)\big)' \sigma^{-1}(u(\cdot),t) \big(z - \hat{x}(u(\cdot),t)\big) \Big\} \tag{2.37}
\]

where $n$ is the dimension of the state vector $x(t)$.
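To make the dependence of the error covariance on the schedule concrete, the following sketch integrates a scalar instance of the Riccati equation (2.36) with a forward Euler step, where the observation gain is a piecewise-constant function of time that plays the role of the sensor schedule. The scalar restriction, the Euler scheme, and the names `riccati_rate`, `integrate_variance`, and `gain_of_time` are assumptions for illustration, not the Phase I implementation.

```python
def riccati_rate(sigma, a, b, h, r=1.0):
    """Scalar form of (2.36): sigma_dot = b^2 + 2 a sigma - (h^2 / r) sigma^2."""
    return b * b + 2.0 * a * sigma - (h * h / r) * sigma * sigma


def integrate_variance(sigma0, gain_of_time, a, b, T, steps=10000, r=1.0):
    """Forward Euler integration of the variance under a time-varying gain.

    gain_of_time(t) returns the observation gain h in use at time t
    (0.0 means that no sensor is active).
    """
    dt = T / steps
    sigma = sigma0
    for k in range(steps):
        sigma += dt * riccati_rate(sigma, a, b, gain_of_time(k * dt), r)
    return sigma


# Sensor off on [0, 1), on (gain 1) on [1, 2]; state model dx = -x dt + dw.
print(integrate_variance(0.5, lambda t: 1.0 if t >= 1.0 else 0.0,
                         a=-1.0, b=1.0, T=2.0))
```

With the sensor off, the variance stays at its uncontrolled steady state $0.5$; switching the sensor on drives it toward the lower steady state $\sqrt{2} - 1$.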


2.2 Quasi-Variational Inequalities for the Optimal Schedule

Consider (2.34) with fixed schedule $u(t) = j$, and let $p_j$ denote the corresponding density $p(\cdot,j)$. Then for $j \in \{1, 2, \ldots, N\}$

\[
dp_j = L^* p_j\, dt + p_j\, h^{jT} dz(t), \qquad p_j(0) = \pi, \tag{2.38}
\]

where

\[
h^j := h(\cdot,j). \tag{2.39}
\]

We set

\[
\Phi_j(t)(F)(\pi) = E\{ F(p_{j,\pi}(t)) \} \tag{2.40}
\]

where $p_{j,\pi}$ indicates the solution of (2.38) with initial value $\pi$.

To simplify the statement and analysis of the quasi-variational inequalities that solve the optimization problem, we consider the case $N = 2$, that is, a single sensor which is scheduled to turn on or off. Consider the cost functions


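\[
C_i := C(i,\cdot),\ i = 1,2, \qquad K_1 := K(1,2), \qquad K_2 := K(2,1), \tag{2.41}
\]

and the notation for the expected cost

\[
C_i(\pi) = (C_i, \pi). \tag{2.42}
\]

Consider now the set of functionals $U_1(\pi,t)$, $U_2(\pi,t)$ such that

\[
\begin{aligned}
& U_1, U_2 \in C(0,T;C_1) \\
& U_1(\cdot,t) \ge 0, \quad U_2(\cdot,t) \ge 0 \\
& U_1(\pi,T) = U_2(\pi,T) = \Psi(\pi)
\end{aligned} \tag{2.43}
\]

\[
\begin{aligned}
U_1(\pi,t) &\le \Phi_1(s-t)U_1(s)(\pi) + \int_t^s \Phi_1(\lambda-t)\, C_1(\pi)\, d\lambda \\
U_2(\pi,t) &\le \Phi_2(s-t)U_2(s)(\pi) + \int_t^s \Phi_2(\lambda-t)\, C_2(\pi)\, d\lambda \qquad \forall s \ge t
\end{aligned} \tag{2.44}
\]

\[
\begin{aligned}
U_1(\pi,t) &\le K_1(\pi) + U_2(\pi,t) \\
U_2(\pi,t) &\le K_2(\pi) + U_1(\pi,t).
\end{aligned} \tag{2.45}
\]

We shall refer to (2.43)-(2.45) as the system of quasi-variational inequalities (QVI). In writing this system we have used the notation $U_i(s)(\pi) = U_i(\pi,s)$, $i = 1,2$.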

This  system  defines  the  “optimal  cost”  for  the  sensor  scheduling  problem 
in  much  the  same  way  that  the  Hamilton-Jacobi-Bellman  dynamic  program¬ 
ming  equation  defines  the  optimal  cost  in  a  conventional  optimal  (stochastic) 
control  problem.  Solution  of  the  QVI’s  not  only  gives  the  optimal  cost,  it  also 
provides  an  optimal  schedule.  This  is  analogous  to  the  case  in  conventional 
control  problems  in  which  the  optimal  control  is  (roughly)  the  gradient  of 
the  optimal  cost  function. 

To understand how the conditions define the cost, consider the two sets of inequalities (2.44) and (2.45). At any time in the interval $[0,T]$ it is either optimal to use the existing sensor configuration, or it is optimal to switch to another configuration. If either member of the first set of inequalities (2.44) holds with equality, then the corresponding sensor configuration is optimal; that is, use the one which holds with equality. This corresponds to optimal estimation with a given sensor. Equality in one of the members of (2.44) means that the “cost to go” $U_j(\pi,t)$ satisfies the Bellman dynamic programming condition.

Alternatively, if equality holds in one member of (2.45), then it is optimal to switch. For example, suppose equality holds in the first member; then it is optimal to switch from sensor 1 to sensor 2, incurring switching cost $K_1(\pi)$, and continue using sensor 2, incurring operating cost $U_2(\pi,t)$. In this case inequality will hold in (2.44).
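The continue-or-switch reading of the two sets of inequalities can be sketched as a small decision rule. The function below assumes that the values $U_1$, $U_2$ and the switching costs $K_1$, $K_2$ at the current state and time are already available (for example from a numerical QVI solution); the name `qvi_action`, the dictionary layout, and the tolerance are illustrative assumptions.

```python
def qvi_action(current, cost_to_go, switch_cost, tol=1e-9):
    """Return ("continue", current) or ("switch", other) for two configurations.

    cost_to_go[i]  : value U_i(pi, t) at the current state and time, i in {1, 2}
    switch_cost[i] : cost K_i(pi) of leaving configuration i for the other one
    """
    other = 2 if current == 1 else 1
    # Equality in the switching inequality U_i <= K_i + U_other means "switch".
    if cost_to_go[current] >= switch_cost[current] + cost_to_go[other] - tol:
        return "switch", other
    # Otherwise the switching inequality is strict and we keep the sensor.
    return "continue", current


print(qvi_action(1, {1: 0.40, 2: 0.30}, {1: 0.10, 2: 0.0}))
print(qvi_action(1, {1: 0.35, 2: 0.30}, {1: 0.10, 2: 0.0}))
```

The tolerance is needed in practice because a numerically computed solution satisfies the equalities only approximately.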

2.2.1  QVIs  for  the  Linear  Gaussian  Problem 

Now consider our Phase I prototype problem of the linear case, where the evolving conditional density is given by (2.37) with $u(t) = j$. Note that in this case the problem has been reduced to finite dimensions, since we need only determine the vectors $\hat{x}_t$ and $\sigma_t$, and not the infinite dimensional “state” $p_t$. In this case the initial density $\pi$ is parameterized by the initial vector $(x,p)$ of the conditional mean and variance. The QVIs then become, for $i = 1,2$,

\[
\begin{aligned}
& U_1(\cdot,\cdot,t) \ge 0, \quad U_2(\cdot,\cdot,t) \ge 0 \\
& U_1(x,p,T) = U_2(x,p,T) = \Psi(x,p) = E[x^2] - E[x]^2 = p
\end{aligned} \tag{2.46}
\]

\[
\begin{aligned}
U_1(x,p,t) &\le \Phi_1(s-t)U_1(s)(x,p) + \int_t^s \Phi_1(\lambda-t)\, C_1(x,p)\, d\lambda \\
U_2(x,p,t) &\le \Phi_2(s-t)U_2(s)(x,p) + \int_t^s \Phi_2(\lambda-t)\, C_2(x,p)\, d\lambda \qquad \forall s \ge t
\end{aligned} \tag{2.47}
\]

\[
\begin{aligned}
U_1(x,p,t) &\le K_1(x,p) + U_2(x,p,t) \\
U_2(x,p,t) &\le K_2(x,p) + U_1(x,p,t).
\end{aligned} \tag{2.48}
\]

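To solve the QVIs numerically it is necessary to rewrite them in differential form (see, for example, [10] or [11] for details). First expand $U_i(x + dx, p + d\sigma, t + dt)$ about $(x, p, t)$ to obtain

\[
U_i(x + dx, p + d\sigma, t + dt) = U_i(x,p,t) + \frac{\partial U_i}{\partial x} dx_t + \frac{\partial U_i}{\partial p} d\sigma_t + \frac{\partial U_i}{\partial t} dt + \frac{1}{2} \frac{\partial^2 U_i}{\partial x^2} dx_t^2 + o(dt)
\]

where $dx_t$ and $d\sigma_t$ may be obtained from the Kalman filtering and Riccati equations (2.35) and (2.36). Applying $\Phi_i(\cdot)$ to these expressions and substituting into the QVIs gives

\[
-\frac{\partial U_i}{\partial t} dt - \frac{\partial U_i}{\partial x} E[dx_t] - \frac{\partial U_i}{\partial p} d\sigma_t - \frac{1}{2} \frac{\partial^2 U_i}{\partial x^2} E[dx_t^2] + o(dt) \le C_i(x,p)\, dt. \tag{2.49}
\]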

To simplify the notation somewhat, consider a version of the Phase I prototype problem with dynamics and observations given by

\[
\begin{aligned}
dx_t &= -a x_t\, dt + dw_t \\
dy_t^i &= h^i x_t\, dt + d\nu_t
\end{aligned}
\]

where $a > 0$, $w_t$ and $\nu_t$ are zero mean unit variance Gaussian (Wiener) processes, and

\[
h^i = \begin{cases} 0 & i = 1 \\ c & i = 2. \end{cases}
\]

By varying the constants $a$ and $c$, the “signal to noise” ratio of the dynamics and the observations may be changed.

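To complete writing the QVIs in differential form, the conditional expectations of $dx$ and $dx^2$, these being the only stochastic terms (depending on the observations) in the expansion, need to be calculated.

First, from the Kalman filtering equation,

\[
\begin{aligned}
E[dx_t^i] &= E\big[ -A_t^i x_t\, dt + h^i \sigma_t^i\, dy_t^i \big] \\
&= -A_t^i\, dt\, E[x_t] + h^i \sigma_t^i\, E[dy_t^i]
\end{aligned}
\]

where

\[
A_t^i = \begin{cases} a & \text{for } i = 1, \\ a + c^2 \sigma_t & \text{for } i = 2. \end{cases}
\]

Taking as the initial condition $x(t) = x$, then $E[x_t] = x$. From the observation equation

\[
E[dy_t^i] = E[h^i x_t\, dt + d\nu_t] = h^i x_t\, dt,
\]

since $\nu_t$ has zero mean.

Secondly,

\[
E[(dx_t^i)^2] = E\big[ (A_t^i x_t\, dt)^2 - 2 h^i \sigma_t^i A_t^i x_t\, dt\, dy_t^i + (h^i \sigma_t^i\, dy_t^i)^2 \big].
\]

All but the last term have order greater than $dt$. Hence,

\[
\begin{aligned}
E[(dx_t^i)^2] &= E[(h^i \sigma_t^i\, dy_t^i)^2] + o(dt) \\
&= (h^i \sigma_t^i)^2 E[(dy_t^i)^2] + o(dt) \\
&= (h^i \sigma_t^i)^2 E[(d\nu_t)^2] + o(dt) \\
&= (h^i \sigma_t^i)^2\, dt + o(dt).
\end{aligned}
\]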

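Substituting into the QVI formulation (2.49) gives the differential formulation for the Gaussian case as

\[
-\frac{\partial U_i}{\partial t} + a x \frac{\partial U_i}{\partial x} - \big(1 - 2ap - (h^i)^2 p^2\big) \frac{\partial U_i}{\partial p} - \frac{1}{2} (h^i p)^2 \frac{\partial^2 U_i}{\partial x^2} \le C_i(x,p). \tag{2.50}
\]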


2.2.2 Numerical Solution of the QVIs

In  general,  the  QVIs  cannot  be  solved  analytically,  so  numerical  approxima¬ 
tions  must  be  sought.  One  method  of  doing  this  has  been  formulated  by 
Rofman  and  Gonzalez  [22].  They  show  that  the  solution  to  the  QVIs  for 
optimal  control  problems  with  stopping  times,  and  continuous  and  impulse 
controls,  is  given  as  the  maximum  element  of  the  set  of  subsolutions  (that  is, 
none  of  the  inequalities  need  be  solved  as  equalities).  Rofman  and  Gonzalez 
then  present  a  discretization  procedure  for  finding  these  subsolutions  in  both 
the  stationary  and  non-stationary  cases.  A  summary  of  their  algorithm  is 
given  in  Figure  (2.1). 

To  adapt  the  Rofman-Gonzalez  algorithm  to  our  problem  (a  non-stationary 
terminal  cost  optimal  control  problem**)  two  minor  (and  related)  modifica¬ 
tions  must  be  made. 

First, the stopping time cost must be replaced by the terminal cost condition

\[
U_i(\pi,T) = \Psi(\pi) \qquad \forall i.
\]

If, for example, $\pi$ is a probability measure and we are estimating the state (as opposed to a given function of the state), then $\Psi(\pi)$ is the variance with respect to $\pi$.

Footnote: Recall our problem is to generate the best estimate of state at time $T$ by optimally using some configuration of sensors under defined cost conditions.

[Figure 2.1 is not reproduced. Legible flow-diagram steps: write the QVIs in discretized form; for a given grid of size $h$ and $\varepsilon > 0$, initialize the variables for all grid indices $p = 0, \ldots, NX$ and $q = 0, \ldots, NT$; start the time loop at $q = NT - 1$. The diagram notes convergence of the iterates as $\varepsilon \to 0$ and of the discrete solution as $h \to 0$.]

Figure 2.1: Flow diagram of the Rofman-Gonzalez algorithm for solving QVIs of two independent variables.

The second modification is related to the first. The Rofman-Gonzalez algorithm must be initialized by a known subsolution of the QVIs. For the stopping time problems the trivial solution

\[
U_i(x,t) = 0
\]

is always a subsolution. For the terminal cost problem, this trivial solution does not usually satisfy the terminal cost condition. To modify the algorithm it is necessary to find a suitable initial solution. In general, this will not be difficult, but will at times introduce minor restrictions on our problems.

For example, returning to the linear Gaussian prototype problem, the QVIs in differential form are given by (2.50) and the boundary conditions

\[
\begin{aligned}
& U_i(x,p,t) \ge 0, \\
& U_i(x,p,T) = \Psi(x,p), \\
& U_i(x,p,t) \le k_{ij} + U_j(x,p,t) \quad \text{for } j \ne i.
\end{aligned}
\]

For this case, $\Psi(x,p) = \mathrm{var}(\pi(x,p)) = p$. Thus $U_i = 0$ is not a solution if $p \ne 0$. Thus the derivative $\partial U_i / \partial p$ is nonzero, which leads to the condition that the coefficients (in the QVIs) of $\partial U_i / \partial p$ should be bounded by the running cost. Noting that these coefficients are just the (negative) right hand side of the Riccati equations, this implies

\[
-\big(1 - 2ap - (h^i)^2 p^2\big) \le C_i(x,p).
\]

That is, any decrease of the overall cost gained by the sensor lowering the variance (error) must be less than any cost incurred running the sensor. Thus this “restriction” is simply a consistency condition which is satisfied by any solution. Hence this condition states that an initial condition must in fact be a solution.
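For intuition about what a numerical solution of the prototype QVIs looks like, the sketch below runs a plain explicit backward dynamic programming recursion on a $(p,t)$ grid for two configurations. It is not the Rofman-Gonzalez algorithm: it is a simpler substitute that assumes running and switching costs independent of $x$, so that the value depends only on the variance $p$ and time $t$. The function name `solve_switching_dp`, its parameters, and the grid sizes are all illustrative assumptions.

```python
def solve_switching_dp(a, gains, run_costs, switch_costs, T,
                       p_max=1.0, n_p=101, n_t=200):
    """Backward recursion for U_i(p, t), i in {0, 1}, with U_i(p, T) = p.

    State model dx = -a x dt + dw, so p_dot = 1 - 2 a p - h^2 p^2.
    gains[i], run_costs[i] : observation gain and running cost rate of i
    switch_costs[i]        : cost of leaving configuration i for the other
    Returns (grid, U) where U[i][m] approximates U_i(grid[m], 0).
    """
    dt = T / n_t
    dp = p_max / (n_p - 1)
    grid = [m * dp for m in range(n_p)]

    def interp(values, p):
        # Linear interpolation on the p grid, clamped to [0, p_max].
        p = min(max(p, 0.0), p_max)
        m = min(int(p / dp), n_p - 2)
        w = (p - grid[m]) / dp
        return (1.0 - w) * values[m] + w * values[m + 1]

    U = [list(grid), list(grid)]          # terminal condition U_i(p, T) = p
    for _ in range(n_t):
        cont = []
        for i in (0, 1):
            h = gains[i]
            row = []
            for p in grid:
                p_next = p + dt * (1.0 - 2.0 * a * p - h * h * p * p)
                row.append(run_costs[i] * dt + interp(U[i], p_next))
            cont.append(row)
        # Either continue with i, or pay the switching cost and continue with 1 - i.
        U = [[min(cont[i][m], switch_costs[i] + cont[1 - i][m])
              for m in range(n_p)] for i in (0, 1)]
    return grid, U


grid, U = solve_switching_dp(a=1.0, gains=(0.0, 5.0), run_costs=(0.0, 0.001),
                             switch_costs=(0.0, 0.0), T=1.0)
print(U[0][50])   # cost starting with the sensor off at p = 0.5
```

With a cheap sensor the cost starting from $p = 0.5$ drops well below $0.5$; with a very expensive sensor it stays at $0.5$, reflecting the consistency condition between variance reduction and running cost.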

2.2.3  Computation  of  the  Optimal  Schedule 

Given the solutions $U_j(\pi,t)$, $j = 1,2$, $t \in [0,T]$, the optimal sensor schedule is constructed in terms of the continuation and switching sets associated with the solution. That is, at any given time $t \in [0,T]$ it is optimal to “continue” with the existing sensor when equality holds in one of (2.44) and strict inequality holds in both of (2.45). At a point $t$ where equality holds in one of (2.45), it is optimal to switch. Bearing in mind that the trajectories followed by the costs $U_j(\pi,t)$, $j = 1,2$, $t \in [0,T]$, depend on the underlying state $p(u(\cdot),t)$, the points in the state space at which the members of (2.45) hold with equality define the boundaries of the continuation sets. The boundaries are themselves the switching sets. When the state $p(u(\cdot),t)$ intersects a boundary, it is optimal to switch (see footnote).

To see that this is the case, suppose that $u(0) = 1$ (we start out using sensor 1). Then define

\[
\tau_1^* = \inf\{ t : U_1(p_1(t),t) = K_1(p_1(t)) + U_2(p_1(t),t) \}. \tag{2.51}
\]

This is the first time at which it is optimal to switch (from 1 to 2). We write

\[
p^*(t) = p_1(t), \qquad t \in [0, \tau_1^*], \tag{2.52}
\]

which is the state (conditional density) during the (initial) interval that sensor 1 is in use.

Next define

\[
\tau_2^* = \inf\{ t > \tau_1^* : U_2(p_2(t),t) = K_2(p_2(t)) + U_1(p_2(t),t) \}. \tag{2.53}
\]

This is the first time at which it is optimal to switch (back to 1 from 2).

Continuing this process, we construct a sequence of stopping times $\tau_1^* < \tau_2^* < \tau_3^* < \cdots$ which defines an optimal sensor schedule.

Footnote: The paper [27] contains the explicit computation of switching and continuation sets for a class of simple QVIs.

For example, returning to our Phase I prototype problem, Figure (2.2) shows the solutions $U_i(x, 0.5, t)$. These solutions represent $U_i$ with the initial condition of starting with the sensor either off ($i = 0$) or on ($i = 1$) for $p = 0.5$, which is the steady state error when no sensor is used (sensor is off). For these solutions, there was no cost incurred to shut off the sensor, and running and switching costs were considered low. The sensor schedule is as expected. That is, regardless of whether the sensor was off or on at the start of the interval, it is off (white region) until the end of the interval, at which point it is turned on (black region; see the footnote on color consistency). The sensor is turned on at the end of the interval for a time interval that will lower the error (variance) to approximately the steady state error (variance) of the sensor’s on state, balanced by the sensor’s running and switching costs. For example, if these costs are too high, the sensor will remain off.

In higher dimensions the problems and solutions become more complicated since either individual sensors or combinations of sensors may be on. A simple example taken from the Phase I work is given in Figure (2.3) for two sensors. The four diagrams represent sensor schedules which initially start in one of the four configurations ($i = 1$, both sensors off, to $i = 4$, both sensors on). In this example, sensor two has five times the gain of sensor one but costs two orders of magnitude more to run. Switching costs for both sensors are moderate, with no cost incurred for turning a sensor off. But note that the steady state solutions of the Riccati equations for all cases are very close, making the sensitivity to the switching costs greater.

Footnote: Throughout these graphs, the color representation of the sensors is not consistent. This is an artifact of the plotting software used. Please see the text for the interpretation of the schedules.

Footnote: Again note that colors may not be consistent between diagrams.

For $i = 1$ (sensor initially off), the sensor generally stays off (black region) except for a small (white) region where the more powerful, and more costly, sensor is turned on for a limited time when the initial state variable ($x$) is low and likely to be lost in noise. The same behavior occurs for $i = 2$, where the more powerful sensor is initially on. In this case the white region (sensor one on) is slightly larger since no “on” switching cost is incurred.

For  the  other  two  cases,  there  is  a  dramatic  change.  For  i  =  3  (the  white 
region)  and  i  =  4  (the  gray  region)  the  second  (weak  but  low  cost  sensor)  is 
always  on.  That  is,  if  no  switching  cost  is  incurred  for  turning  this  sensor  on 
it  will  continue  to  run  due  to  its  low  running  cost.  For  i  =  4,  where  the  first 
sensor  is  also  initially  on,  it  will  stay  on  for  a  short  period  to  lower  estimate 
error  where  needed  but  quickly  shuts  down  due  to  its  large  running  cost. 
This  example  indicates  the  importance  of  defining  good  cost  functions. 
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2.3  Simplified  Algorithms  for  Gaussian  State  and  Sensor  Models: 
An  Alternate  Approach 

In this section we solve the sensor scheduling problem when the state and sensor observations satisfy linear Gaussian models, using an alternate formulation which may be used to validate the QVI formulation. As before, the optimal state estimate is obtained via Kalman filtering. We assume that the running cost associated with each sensor is a constant times the duration the sensor is used and that a constant cost is associated with turning on the sensor. Under these assumptions, the optimal sensor schedule is a single interval sensor schedule policy. In fact, the scheduling of the sensors reduces to the scheduling of sensor turn-on times, since once a sensor is turned on, it remains on until the terminal time $T$.

Scalar  State  and  Sensor  Models:  Assume  that  the  state  and  two  sensor 
observation  models  obey  the  following  scalar  equations: 

\[
\begin{aligned}
\text{State:} \quad & dx = a\, x\, dt + b\, dw \\
\text{Sensor 1:} \quad & dy_1 = c_1 x\, dt + dv \\
\text{Sensor 2:} \quad & dy_2 = c_2 x\, dt + dv.
\end{aligned}
\]

Here we have assumed both observation noises have unit variance. We do this because the effect of increased or decreased observation noise can be modeled by an appropriate change in the observation coefficient $c_i$.

A sensor schedule consists of two sets of intervals $I_i = \{ (a_1^i, b_1^i), \ldots, (a_{n_i}^i, b_{n_i}^i) \}$, $i = 1,2$, where sensor $i$ is turned on at time $a_j^i$ and turned off at time $b_j^i$ for $j = 1, \ldots, n_i$. (We assume that the intervals are ordered such that $0 \le a_1^i < b_1^i < a_2^i < b_2^i < \cdots < a_{n_i}^i < b_{n_i}^i \le T$.)

The sensor scheduling problem seeks to find a sensor schedule which minimizes the cost function

\[
J(I_1,I_2) = E\big( (x(T) - \hat{x}(T))^2 \big) + \sum_{i=1}^{2} k_i \sum_{j=1}^{n_i} (b_j^i - a_j^i) + n_1 s_1 + n_2 s_2.
\]

Here $k_i$ is the cost for running sensor $i$ for one time unit, $n_i s_i$ is the cost for switching sensor $i$ on $n_i$ times, and $\hat{x}(T)$ is the conditional expectation of $x(T)$ given the observations up to $T$.

The first term in the cost function is the conditional covariance of the state, at time $T$, given the observations. It is well known that the conditional covariance of $x$ given the observations up to time $t$ satisfies a scalar Riccati equation

\[
(R_i) \qquad \dot{p}_i(t,t_0;p_0) = 2a\, p_i(t,t_0;p_0) + b^2 - c_i^2\, p_i^2(t,t_0;p_0), \qquad p_i(t_0,t_0;p_0) = p_0.
\]

If both sensors are on, the conditional covariance obeys equation (R) with parameters $(a, b, \sqrt{c_1^2 + c_2^2})$. We let $p_3(t,t_0;p_0)$ denote the conditional covariance when both sensors are on.

The  search  for  the  optimal  sensor  schedule  over  the  class  of  all  possible 
sensor  schedules  can  be  reduced  to  the  search  of  the  optimal  sensor  schedule 
over  a  much  smaller  set  of  schedules.  In  fact,  it  can  be  shown  that  the  search 
space  for  the  optimal  sensor  schedule  problem  can  be  restricted  to  a  search 
over  the  set  of  single  interval  sensor  schedules. 

A single interval sensor schedule is a sensor schedule where $n_i = 1$, $i = 1,2$. That is, a schedule where each sensor is on for at most one time interval.

We  now  state  an  important  result  used  to  find  optimal  sensor  schedules 
for  this  case. 

Fact: Assume that the state and observations for each sensor satisfy the equations given above. Given any sensor schedule $S = \{I_1, I_2\}$ with $n_i \ne 0$, there exists a single interval sensor schedule $\hat{S} = \{\hat{I}_1, \hat{I}_2\}$ which has cost no greater than $S$, i.e.,

\[
J(\hat{S}) \le J(S).
\]

To see how this develops, let $l_j^i = (b_j^i - a_j^i)$ and define

\[
\hat{I}_i = \Big( T - \sum_{j=1}^{n_i} l_j^i,\ T \Big).
\]

It is clear that $\hat{S}$ incurs the same running cost as $S$ since both have their sensors on for the same amount of time. It is also clear that $\hat{S}$ has no greater switching cost than $S$ since $S$ switches at least once for each sensor. It remains to be shown that $\hat{S}$ has no greater conditional covariance. The complete proof relies on properties of solutions to the Riccati equation.


Finding  the  Optimal  Sensor  Schedule:  We  know  that  if  a  sensor  is 
switched  on  it  will  be  switched  off  at  time  T.  Hence  determining  the  optimal 
single  interval  sensor  schedule  reduces  to  finding  the  optimal  sensor  switch 
on  times. 

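The scalar Riccati equation can be solved by separation of variables. The solution is given by

\[
p_i(t;p_0) = \begin{cases}
\big( A_i \coth(A_i t + w_i) + a \big) / c_i^2 & p_0 > (A_i + a)/c_i^2 \\
\big( A_i \tanh(A_i t + v_i) + a \big) / c_i^2 & \text{otherwise}
\end{cases}
\]

where

\[
A_i = \sqrt{a^2 + b^2 c_i^2}, \qquad
w_i = \coth^{-1}\Big( \frac{c_i^2 p_0 - a}{A_i} \Big), \qquad
v_i = \tanh^{-1}\Big( \frac{c_i^2 p_0 - a}{A_i} \Big).
\]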
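The closed-form solution can be checked against a direct numerical integration of the scalar Riccati equation $\dot{p} = 2ap + b^2 - c^2 p^2$. The sketch below implements both; the function names `riccati_closed_form` and `riccati_euler` are assumptions for illustration, and the closed form assumes $c \ne 0$ and $c^2 p_0 - a > -A$ (which holds for any $p_0 \ge 0$ when $b \ne 0$).

```python
import math


def riccati_closed_form(t, p0, a, b, c):
    """Solution of p_dot = 2 a p + b^2 - c^2 p^2 with p(0) = p0."""
    A = math.sqrt(a * a + b * b * c * c)
    q0 = c * c * p0 - a                  # q = c^2 p - a obeys q_dot = A^2 - q^2
    if q0 <= -A:
        raise ValueError("initial variance outside the range covered here")
    if q0 == A:                          # already at the steady state
        q = A
    elif q0 > A:                         # coth branch
        w = math.atanh(A / q0)           # arccoth(y) = artanh(1 / y) for y > 1
        q = A / math.tanh(A * t + w)
    else:                                # tanh branch
        v = math.atanh(q0 / A)
        q = A * math.tanh(A * t + v)
    return (q + a) / (c * c)


def riccati_euler(t, p0, a, b, c, steps=20000):
    """Forward Euler integration of the same equation, for comparison."""
    dt = t / steps
    p = p0
    for _ in range(steps):
        p += dt * (2.0 * a * p + b * b - c * c * p * p)
    return p


for p0 in (0.1, 0.5, 2.0):
    print(p0, riccati_closed_form(1.0, p0, -1.0, 1.0, 1.0),
          riccati_euler(1.0, p0, -1.0, 1.0, 1.0))
```

The substitution $q = c^2 p - a$ used in the code turns the Riccati equation into $\dot{q} = A^2 - q^2$, which is where the hyperbolic tangent and cotangent branches come from.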
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In  order  to  compute  some  of  the  optimal  sensor  schedules  we  will  need  to 
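compute the derivative of $p_i(t;p_0)$ with respect to $p_0$:

\[
\frac{\partial p_i(t;p_0)}{\partial p_0} = \begin{cases}
\dfrac{-A_i^2\, \mathrm{csch}^2(A_i t + w_i)}{A_i^2 - (c_i^2 p_0 - a)^2} & p_0 > (A_i + a)/c_i^2 \\[2ex]
\dfrac{A_i^2\, \mathrm{sech}^2(A_i t + v_i)}{A_i^2 - (c_i^2 p_0 - a)^2} & \text{otherwise}
\end{cases}
\]

where $A_i^2 - (c_i^2 p_0 - a)^2 = c_i^2 (2 a p_0 + b^2 - c_i^2 p_0^2)$.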


Let $t_1$ be the switch-on time for Sensor 1 and $t_2$ be the switch-on time for Sensor 2. Let $p_0 = -b^2/2a$. The optimal sensor schedule is found by enumerating the six possible switching cases, computing their corresponding optimal costs, and picking the case with the lowest overall cost. The six switching cases are (1) $t_1 < T$, $t_2 = T$; (2) $t_1 = T$, $t_2 < T$; (3a) $0 < t_1 < t_2 < T$, (3b) $t_1 = 0$, $t_2 < T$; (4a) $0 < t_2 < t_1 < T$, (4b) $t_1 < T$, $t_2 = 0$; (5) $t_1 = t_2 < T$; (6) $t_1 = t_2 = T$, no switching.
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Cases  (1)  and  (2) 

Let $\Delta t_i = T - t_i$; then the cost is given by

\[
J_i(\Delta t_i) = p_i(\Delta t_i;p_0) + \Delta t_i\, k_i + s_i.
\]

Setting the derivative with respect to $\Delta t_i$ equal to zero yields

\[
\dot{p}_i(\Delta t_i;p_0) = -k_i.
\]
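The stationarity condition for Cases (1) and (2) can be solved numerically by bisection on the on-duration. The sketch below uses the closed-form variance together with $\dot{p} = 2ap + b^2 - c^2 p^2$; the names `variance` and `optimal_on_duration`, the bisection scheme, and the handling of the end points (never switch on, or stay on for the whole horizon) are illustrative assumptions.

```python
import math


def variance(t, p0, a, b, c):
    """Closed-form solution of p_dot = 2 a p + b^2 - c^2 p^2, p(0) = p0, c != 0."""
    A = math.sqrt(a * a + b * b * c * c)
    q0 = c * c * p0 - a
    if q0 > A:
        q = A / math.tanh(A * t + math.atanh(A / q0))    # coth branch
    elif q0 == A:
        q = A
    else:
        q = A * math.tanh(A * t + math.atanh(q0 / A))    # tanh branch
    return (q + a) / (c * c)


def optimal_on_duration(T, p0, a, b, c, k, iters=60):
    """Duration d in [0, T] at which p_dot(d; p0) = -k, found by bisection."""
    def slope(d):
        # Derivative of J(d) = p(d; p0) + k d + s with respect to d.
        p = variance(d, p0, a, b, c)
        return 2.0 * a * p + b * b - c * c * p * p + k

    if slope(0.0) >= 0.0:
        return 0.0            # the sensor is not worth its running cost
    if slope(T) <= 0.0:
        return T              # keep the sensor on for the whole horizon
    lo, hi = 0.0, T
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if slope(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


a, b, c, k = -1.0, 1.0, 5.0, 0.1
p0 = -b * b / (2.0 * a)
d = optimal_on_duration(1.0, p0, a, b, c, k)
print(d, variance(d, p0, a, b, c))
```

The resulting cost $p_i(\Delta t_i;p_0) + \Delta t_i k_i + s_i$ still has to be compared with the no-switching cost $p_0$, since the switching cost $s_i$ does not enter the stationarity condition.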

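Cases (3a) and (4a)

Let $p_3(t;p_0)$ be the solution to the Riccati equation when both sensors are operating (i.e., $c = \sqrt{c_1^2 + c_2^2}$). We will compute the solution for Case (3a); a change in notation gives the solution for Case (4a).

Let $\Delta t_{12} = t_2 - t_1$ and $\Delta t_2 = T - t_2$. Sensor 1 runs for $\Delta t_{12} + \Delta t_2$ and Sensor 2 for $\Delta t_2$, so the cost to minimize is given by

\[
J(\Delta t_{12}, \Delta t_2) = p_3(\Delta t_2; p_1(\Delta t_{12};p_0)) + \Delta t_{12}\, k_1 + \Delta t_2 (k_1 + k_2) + s_1 + s_2.
\]

Taking partial derivatives of $J$ with respect to $\Delta t_{12}$ and $\Delta t_2$ and setting them to zero yields the two equations:

\[
\dot{p}_1(\Delta t_{12};p_0) = -k_1\, \frac{2 a p_1 + b^2 - (c_1^2 + c_2^2)\, p_1^2}{\dot{p}_3(\Delta t_2; p_1)}, \qquad p_1 = p_1(\Delta t_{12};p_0),
\]

\[
\dot{p}_3(\Delta t_2; p_1(\Delta t_{12};p_0)) = -(k_1 + k_2).
\]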


Cases  (3b)  and  (4b) 

In this case the cost is given as a function of a single time $t$ since the other sensor is always on. Again, we will work out case (3b). The other case is obtained by the appropriate notation changes.

Let $\Delta t_2 = T - t_2$. The cost to minimize is given by

\[
J(\Delta t_2) = p_3(\Delta t_2; p_1(T - \Delta t_2;p_0)) + T k_1 + \Delta t_2\, k_2 + s_1 + s_2.
\]

Taking the derivative of $J$ with respect to $\Delta t_2$ and setting the result equal to zero yields

\[
c_2^2\, p_1^2 \big( 2 a p_3 + b^2 - (c_1^2 + c_2^2)\, p_3^2 \big) = k_2 \big( 2 a p_1 + b^2 - (c_1^2 + c_2^2)\, p_1^2 \big)
\]

with $p_3 = p_3(\Delta t_2; p_1(T - \Delta t_2;p_0))$ and $p_1 = p_1(T - \Delta t_2;p_0)$.

Case  5 

Let $p_3(t;p_0)$ be as defined above. Let $\Delta t = T - t$. The cost to minimize is given by

\[
J(\Delta t) = p_3(\Delta t;p_0) + \Delta t (k_1 + k_2) + s_1 + s_2.
\]

Taking the derivative of $J$ with respect to $\Delta t$ and setting the result equal to zero yields

\[
\dot{p}_3(\Delta t;p_0) = -(k_1 + k_2).
\]

Again, to find the optimal sensor schedule, solve for the optimal switch-on times in each of the cases above, compute the corresponding cost, and pick the times which result in the lowest overall cost.
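As a cross-check on the case-by-case enumeration, the optimal switch-on times can also be approximated by brute force: evaluate the cost on a grid of $(t_1, t_2)$ pairs and keep the smallest. The sketch below does this with a forward Euler integration of the variance; it is a coarse substitute for the analytical conditions, and the names `terminal_variance` and `best_switch_on_times` as well as the grid resolution are assumptions for illustration.

```python
def terminal_variance(t1, t2, T, a, b, c1, c2, p0, steps=400):
    """Variance at T when sensor i is on from t_i until T (t_i = T: never on)."""
    dt = T / steps
    p = p0
    for n in range(steps):
        t = n * dt
        gain_sq = (c1 * c1 if t >= t1 else 0.0) + (c2 * c2 if t >= t2 else 0.0)
        p += dt * (2.0 * a * p + b * b - gain_sq * p * p)
    return p


def best_switch_on_times(T, a, b, c1, c2, k1, k2, s1, s2, n_grid=20):
    """Grid search over switch-on times; returns (cost, t1, t2)."""
    p0 = -b * b / (2.0 * a)              # steady state with both sensors off
    times = [T * m / n_grid for m in range(n_grid + 1)]
    best = None
    for t1 in times:
        for t2 in times:
            cost = terminal_variance(t1, t2, T, a, b, c1, c2, p0)
            cost += k1 * (T - t1) + k2 * (T - t2)
            cost += (s1 if t1 < T else 0.0) + (s2 if t2 < T else 0.0)
            if best is None or cost < best[0]:
                best = (cost, t1, t2)
    return best


print(best_switch_on_times(1.0, -1.0, 1.0, 1.0, 5.0,
                           k1=0.001, k2=0.001, s1=0.0, s2=0.0))
```

A grid search of this kind scales poorly with the number of sensors, which is one reason the reduction to single interval schedules and the closed-form stationarity conditions are useful.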

Example: Let $a = -1$, $b = 1$, $c_1 = 1$, and $c_2 = 5$. The optimal sensor schedule depends on the values of the running costs $k_1$, $k_2$ and the switching costs $s_1$, $s_2$. A graph of the optimal schedule switching curves is given in Figure (2.4). In this example there is no switching cost for either sensor and the running costs for each sensor range from $\exp(-8)$ to $\exp(4)$. The graph is shown in log scale. Regions with the same sensor policy, e.g., $t_2 < t_1 < T$, are shaded with the same color.

We see from the graph that when the running cost for each sensor is low, relative to the conditional covariance, then both sensors are turned on. We also see that as the running cost for a sensor is increased its likelihood of being used is decreased; eventually the sensor is too expensive and hence not used.

These  results  have  an  interesting  interpretation.  Not  only  do  they  tell 
how  to  schedule  the  available  sensors  in  order  to  make  the  optimal  decision 
in  T  time  units,  but  they  also  indicate  the  minimum  desirable  duration  for 
the  optimal  schedule.  That  is,  the  smallest  time  necessary  to  obtain  the 
optimal  state  estimate. 
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Figure  2.4:  Switching  Curves  for  Example  1. 
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